---
layout: docs
page_title: Integrate Consul service mesh
description: |-
  Learn how to use Nomad with Consul service mesh to enable secure service-to-service communication. Review an example that enables secure communication with Consul TLS, Consul access control lists (ACLs), and a transparent proxy.
---

# Consul service mesh

## Introduction

Service mesh is a networking pattern that deploys and configures
infrastructure to directly connect workloads. One of the most common pieces of
infrastructure deployed is the sidecar proxy. These proxies usually run
alongside the main workload in an isolated network namespace such that all
network traffic flows through the proxy.

The proxies are often referred to as the **data plane** since they are
responsible for _moving data_ while the components that configure them are part
of the **control plane** because they are responsible for controlling the _flow
of data_.

By funneling traffic through a common layer of infrastructure, the control plane
is able to centralize and automatically apply configuration to all proxies to
enable features such as automated traffic encryption, fine-grained routing, and
service-based access control permissions throughout the entire mesh.

[Consul service mesh](/consul/docs/connect) provides
service-to-service connection authorization and encryption using mutual
Transport Layer Security (TLS). Applications can use sidecar proxies in a
service mesh configuration to automatically establish TLS connections for
inbound and outbound connections without being aware of the service mesh at all.

~> **Note:** Nomad's service mesh integration requires Linux network namespaces.
Consul service mesh will not run on Windows or macOS.

## Nomad with Consul service mesh integration

In a production environment, Nomad and Consul exist within the same datacenter.

![Reference diagram](/img/deploy/nomad_reference_diagram.png)

Nomad integrates with Consul to provide secure service-to-service communication
between Nomad jobs and task groups. To support Consul service mesh, Nomad
adds a new networking mode for jobs that enables tasks in the same task group to
share their networking stack. With a few changes to the job specification, job
authors can opt into service mesh integration. When service mesh is enabled, Nomad will
launch a proxy alongside the application in the job file. The proxy (Envoy)
provides secure communication with other applications in the cluster.

Nomad job specification authors can use Nomad's Consul service mesh integration to
implement [service segmentation](https://www.consul.io/use-cases/multi-platform-service-mesh) in a
microservice architecture running in public clouds without having to directly
manage TLS certificates. This is transparent to job specification authors as
security features in service mesh continue to work even as the application scales up
or down or gets rescheduled by Nomad.

To use the Consul service mesh integration with Consul ACLs enabled, see the
[Secure Nomad Jobs with Consul Service Mesh](/nomad/tutorials/integrate-consul/consul-service-mesh)
guide.

## Nomad Consul service mesh example

The following section walks through an example to enable secure communication
between a web dashboard and a backend counting service. The web dashboard and the counting service are
managed by Nomad. Nomad additionally configures Envoy proxies to run alongside
these applications. The dashboard is configured to connect to the counting
service through its local Envoy proxy rather than directly over the network.
Nomad manages the proxy, which handles mTLS communication to the counting service.

### Prerequisites

#### Consul

The Consul service mesh integration with Nomad requires [Consul 1.8 or
later](https://releases.hashicorp.com/consul/1.8.0/). The Consul agent can be
run in dev mode with the following command:

~> **Note:** Nomad's Consul service mesh integration requires the `consul` binary in your `$PATH`.

```shell-session
$ consul agent -dev
```

To use service mesh on a non-dev Consul agent, you must at minimum enable the
gRPC port and set `connect` to enabled in your Consul client configuration,
using the form that matches your configuration format. Consul agents running TLS
and version [1.14.0](https://releases.hashicorp.com/consul/1.14.0) or later
should set the `grpc_tls` configuration parameter instead of `grpc`. See
the Consul [port documentation](https://developer.hashicorp.com/consul/docs/install/ports) for further reference material.

For HCL configurations:

```hcl
# ...

ports {
  grpc = 8502
}

connect {
  enabled = true
}
```

For JSON configurations:

```javascript
{
  // ...
  "ports": {
    "grpc": 8502
  },
  "connect": {
    "enabled": true
  }
}
```
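
For Consul 1.14.0 or later with TLS, the `grpc_tls` variant mentioned above might look like the following sketch. It assumes the standard `grpc_tls` port of `8503`. Setting `grpc = -1` to turn off the plaintext listener is an optional hardening choice made for this illustration, not a requirement of the integration.

```hcl
# Consul agent configuration (sketch), assuming Consul 1.14.0+ with TLS enabled.

ports {
  grpc     = -1   # assumption: disable the plaintext gRPC listener
  grpc_tls = 8503 # TLS-enabled gRPC listener
}

connect {
  enabled = true
}
```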

##### Consul TLS

~> **Note:** Consul 1.14+ made a [backwards incompatible change][consul_grpc_tls]
in how TLS-enabled gRPC listeners work. When using Consul 1.14 or later with TLS
enabled, you must specify additional Nomad agent configuration to work with
Connect. The `consul.grpc_ca_file` value (introduced in Nomad 1.4.4) must now be
configured, and `consul.grpc_address` will most likely need to be set to use the
new standard `grpc_tls` port of `8503`.

```hcl
consul {
  grpc_ca_file = "/etc/tls/consul-agent-ca.pem"
  grpc_address = "127.0.0.1:8503"
  ca_file      = "/etc/tls/consul-agent-ca.pem"
  cert_file    = "/etc/tls/dc1-client-consul-0.pem"
  key_file     = "/etc/tls/dc1-client-consul-0-key.pem"
  ssl          = true
  address      = "127.0.0.1:8501"
}
```

##### Consul access control lists

~> **Note:** Starting in Nomad v1.3.0, the Consul Service Identity ACL tokens that
Nomad automatically generates on behalf of Connect-enabled services are created in
[`Local`] rather than Global scope, and are no longer replicated globally.

To facilitate cross-Consul datacenter requests of Connect services registered by
Nomad, Consul agents will need to be configured with [default anonymous][anon_token]
ACL tokens with ACL policies of sufficient permissions to read service and node
metadata pertaining to those requests. This mechanism is described in Consul [#7414][consul_acl].
A typical Consul agent anonymous token may contain an ACL policy such as:

```hcl
service_prefix "" { policy = "read" }
node_prefix    "" { policy = "read" }
```

##### Transparent proxy

Using Nomad's support for [transparent proxy][] configures the task group's
network namespace so that traffic flows through the Envoy proxy. When the
[`transparent_proxy`][] block is enabled:

* Nomad will invoke the [`consul-cni`][] CNI plugin to configure `iptables` rules
  in the network namespace to force outbound traffic from an allocation to flow
  through the proxy.
* If the local Consul agent is serving DNS, Nomad will set the IP address of the
  Consul agent as the nameserver in the task's `/etc/resolv.conf`.
* Consul will provide a [virtual IP][] for any upstream service the workload
  has access to, based on the service intentions.

Using transparent proxy has several important requirements:

* You must have the [`consul-cni`][] CNI plugin installed on the client host
  along with the usual [required CNI plugins][cni_plugins].
* To use Consul DNS and virtual IPs, you will need to configure Consul's DNS
  listener to be exposed to the workload network namespace. You can do this
  without exposing the Consul agent on a public IP by setting the Consul
  `bind_addr` to bind on a private IP address (the default is to use the
  `client_addr`).
* The Consul agent must be configured with [`recursors`][] if you want
  allocations to make DNS queries for applications outside the service mesh.
* Your workload's task cannot use the same [Unix user ID (UID)][uid] as the
  Envoy sidecar proxy.
* You cannot set a [`network.dns`][] block on the allocation (unless you set
  [`no_dns`][tproxy_no_dns], see below).

For example, the following Consul HCL configuration uses a [go-sockaddr/template][]
to bind to the subnet `10.37.105.0/20` and sets recursive DNS to the OpenDNS
nameservers:

```hcl
bind_addr   = "{{ GetPrivateInterfaces | include \"network\" \"10.37.105.0/20\" | limit 1 | attr \"address\" }}"

recursors = ["208.67.222.222", "208.67.220.220"]
```
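
If you need to supply your own `network.dns` block, the `no_dns` exception from the requirements list above could be sketched as follows. The nameserver address is a placeholder chosen for illustration. With `no_dns` set, Nomad should leave the DNS settings you configure in place, so the workload would likely not resolve `*.virtual.consul` names unless the configured nameserver forwards them to Consul.

```hcl
group "api" {
  network {
    mode = "bridge"

    # Custom DNS settings; with transparent proxy these require no_dns below.
    dns {
      servers = ["10.0.0.53"] # placeholder nameserver, an assumption
    }
  }

  service {
    name = "count-api"
    port = "9001"

    connect {
      sidecar_service {
        proxy {
          transparent_proxy {
            # Ask Nomad not to set the Consul agent as the nameserver.
            no_dns = true
          }
        }
      }
    }
  }

  # ...
}
```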

#### Nomad

Nomad must schedule onto a routable interface in order for the proxies to
connect to each other. The following steps show how to start a Nomad dev agent
configured for Consul service mesh.

```shell-session
$ sudo nomad agent -dev-connect
```

#### Container Network Interface (CNI) plugins

Nomad uses CNI reference plugins to configure the network namespace used to secure the
Consul service mesh sidecar proxy. All Nomad client nodes using network namespaces
must have these CNI plugins [installed][cni_install].

To use [`transparent_proxy`][] mode, Nomad client nodes will also need the
[`consul-cni`][] plugin installed. See the Linux post-installation [steps](/nomad/docs/deploy#linux-post-installation-steps) for more detail on how to install CNI plugins.
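
As a sketch, the Nomad client agent configuration can point at the directory that holds the CNI plugins. This assumes the plugins, including `consul-cni`, are installed under `/opt/cni/bin`, which is the default value of `cni_path`. Adjust the path if you installed them elsewhere.

```hcl
client {
  enabled = true

  # Directory Nomad searches for CNI plugin binaries.
  cni_path = "/opt/cni/bin"
}
```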

### Run the service mesh-enabled services

Once Nomad and Consul are running, with Consul DNS enabled for transparent proxy
mode as described above, submit the following service mesh-enabled services to
Nomad by copying the HCL into a file named `servicemesh.nomad.hcl` and running:
`nomad job run servicemesh.nomad.hcl`

```hcl
job "countdash" {
  datacenters = ["dc1"]

  group "api" {
    network {
      mode = "bridge"
    }

    service {
      name = "count-api"
      port = "9001"

      connect {
        sidecar_service {
          proxy {
            transparent_proxy {}
          }
        }
      }
    }

    task "web" {
      driver = "docker"

      config {
        image = "hashicorpdev/counter-api:v3"
      }
    }
  }

  group "dashboard" {
    network {
      mode = "bridge"

      port "http" {
        static = 9002
        to     = 9002
      }
    }

    service {
      name = "count-dashboard"
      port = "http"

      connect {
        sidecar_service {
          proxy {
            transparent_proxy {}
          }
        }
      }
    }

    task "dashboard" {
      driver = "docker"

      env {
        COUNTING_SERVICE_URL = "http://count-api.virtual.consul"
      }

      config {
        image = "hashicorpdev/counter-dashboard:v3"
      }
    }
  }
}
```

The job contains two task groups: an API service and a web frontend.

#### API service

The API service is defined as a task group with a bridge network:

```hcl
group "api" {
  network {
    mode = "bridge"
  }

  # ...
}
```

Since the API service is only accessible via Consul service mesh, it does not
define any ports in its network. The `connect` block enables the service mesh
and the `transparent_proxy` block ensures that the service will be reachable via
a virtual IP address when used with Consul DNS.

```hcl
group "api" {

  # ...

  service {
    name = "count-api"
    port = "9001"

    connect {
      sidecar_service {
        proxy {
          transparent_proxy {}
        }
      }
    }
  }

  # ...

}
```

The `port` in the service block is the port the API service listens on. The
Envoy proxy will automatically route traffic to that port inside the network
namespace. Note that currently this cannot be a named port; it must be a
hard-coded port value. See [GH-9907].

#### Web frontend

The web frontend is defined as a task group with a bridge network and a static
forwarded port:

```hcl
group "dashboard" {
  network {
    mode = "bridge"

    port "http" {
      static = 9002
      to     = 9002
    }
  }

  # ...

}
```

The `static = 9002` parameter requests that the Nomad scheduler reserve port 9002
on a host network interface. The `to = 9002` parameter forwards that host port to
port 9002 inside the network namespace.

This allows you to connect to the web frontend in a browser by visiting
`http://<host_ip>:9002` as shown below:

[![Count Dashboard][count-dashboard]][count-dashboard]
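
If you don't need a fixed host port, one variation on the `network` block above is to omit `static` so that the scheduler picks a dynamic host port. This is a sketch rather than part of the example job. You would need to look up the assigned port, for example with `nomad alloc status`, before opening the dashboard.

```hcl
group "dashboard" {
  network {
    mode = "bridge"

    port "http" {
      # No static value: Nomad assigns a dynamic host port
      # and forwards it to port 9002 inside the network namespace.
      to = 9002
    }
  }

  # ...
}
```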

The web frontend connects to the API service via Consul service mesh.

```hcl
service {
  name = "count-dashboard"
  port = "http"

  connect {
    sidecar_service {
      proxy {
        transparent_proxy {}
      }
    }
  }
}
```

The `connect` block with `transparent_proxy` configures the web frontend's
network namespace to route all access to the `count-api` service through the
Envoy proxy.

The web frontend is configured to communicate with the API service with an
environment variable `$COUNTING_SERVICE_URL`:

```hcl
env {
  COUNTING_SERVICE_URL = "http://count-api.virtual.consul"
}
```

The `transparent_proxy` block ensures that DNS queries are made to Consul so
that the `count-api.virtual.consul` name resolves to a virtual IP address. Note
that you don't need to specify a port number because traffic sent to the virtual
IP is directed only to the correct service port.
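
Because Consul provides virtual IPs based on the service intentions, the dashboard may need an intention that permits it to reach `count-api`. A minimal sketch of such a Consul `service-intentions` configuration entry follows. The file name `count-api-intentions.hcl` is an assumption, and you would typically apply the entry with `consul config write count-api-intentions.hcl`.

```hcl
# Consul configuration entry (sketch): allow count-dashboard to call count-api.
Kind = "service-intentions"
Name = "count-api"

Sources = [
  {
    Name   = "count-dashboard"
    Action = "allow"
  }
]
```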

#### Manually configured upstreams

You can also use Connect without Consul DNS and `transparent_proxy` mode. This
approach is not recommended because it requires duplicating service intention
information in an `upstreams` block in the Nomad job specification. However,
Consul DNS is not protected by ACLs, so you might prefer this approach if you
don't want to expose Consul DNS to untrusted workloads.

In that case, you can add `upstreams` blocks to the job spec. You don't need the
`transparent_proxy` block for the `count-api` service:

```hcl
group "api" {

  # ...

  service {
    name = "count-api"
    port = "9001"

    connect {
      sidecar_service {}
    }
  }

  # ...

}
```

But you'll need to add an `upstreams` block to the `count-dashboard` service:

```hcl
service {
  name = "count-dashboard"
  port = "http"

  connect {
    sidecar_service {
      proxy {
        upstreams {
          destination_name = "count-api"
          local_bind_port  = 8080
        }
      }
    }
  }
}
```

The `upstreams` block defines the remote service to access (`count-api`) and
what port to expose that service on inside the network namespace (`8080`).

The web frontend will also need to use an environment variable to communicate
with the API service:

```hcl
env {
  COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}"
}
```

This environment variable value gets interpolated with the upstream's
address. Note that dashes (`-`) are converted to underscores (`_`) in
environment variables so `count-api` becomes `count_api`.
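
Nomad also sets separate IP and port variables for each upstream, which can help when an application takes the host and port as distinct settings. The sketch below assumes a hypothetical application that reads `COUNTING_SERVICE_HOST` and `COUNTING_SERVICE_PORT`. Those two names are invented for illustration.

```hcl
env {
  # Hypothetical variable names; the NOMAD_UPSTREAM_* values are set by Nomad.
  COUNTING_SERVICE_HOST = "${NOMAD_UPSTREAM_IP_count_api}"
  COUNTING_SERVICE_PORT = "${NOMAD_UPSTREAM_PORT_count_api}"
}
```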

### Envoy proxy

Consul Service Mesh uses [Envoy][] as its proxy. Nomad calls Consul's [`consul
connect envoy -bootstrap`][consul_cli_envoy] CLI command to generate the
initial proxy configuration.

Nomad injects a prestart sidecar Docker task to run the Envoy proxy. This task
can be customized using the [`sidecar_task`][] block.
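
For instance, a `sidecar_task` block can adjust the resources given to the injected Envoy task. The values below are illustrative assumptions, not recommendations.

```hcl
service {
  name = "count-api"
  port = "9001"

  connect {
    sidecar_service {}

    # Overrides applied to the Envoy sidecar task that Nomad injects.
    sidecar_task {
      resources {
        cpu    = 500  # MHz, illustrative value
        memory = 1024 # MB, illustrative value
      }
    }
  }
}
```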

### Gateways

Since the mesh defines a closed boundary that only selected services can
participate in, there are specialized proxies called gateways that can be used
for mesh-wide connectivity. Nomad can deploy these gateways using the
[`gateway`][] block. Nomad injects an Envoy proxy task to any `group` with a
`gateway` service.

The types of gateways provided by Consul Service Mesh are:

- **Mesh gateways** allow communication between different service meshes and
  are deployed using the [`mesh`][] parameter.

- **Ingress gateways** allow services outside the mesh to connect to services
  inside the mesh and are deployed using the [`ingress`][] parameter.

- **Egress gateways** allow services inside the mesh to communicate with
  services outside the mesh and are deployed using the [`terminating`][]
  parameter.
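
As an illustration of the `gateway` block, the following sketch defines an ingress gateway that exposes `count-api` to clients outside the mesh. The group name, gateway service name, and port `8080` are assumptions chosen for the example.

```hcl
group "ingress-group" {
  network {
    mode = "bridge"

    # Host port on which outside clients reach the gateway.
    port "inbound" {
      static = 8080
      to     = 8080
    }
  }

  service {
    name = "count-ingress" # hypothetical gateway service name
    port = "8080"

    connect {
      gateway {
        ingress {
          # Forward TCP traffic arriving on port 8080 to count-api.
          listener {
            port     = 8080
            protocol = "tcp"

            service {
              name = "count-api"
            }
          }
        }
      }
    }
  }
}
```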

## Limitations

- The minimum Consul version to use Connect with Nomad is Consul v1.8.0.
- The `consul` binary must be present in Nomad's `$PATH` to run the Envoy
  proxy sidecar on client nodes.
- Consul service mesh using network namespaces is only supported on Linux.
- Prior to Consul 1.9, the Envoy sidecar proxy will drop and stop accepting
  connections while the Nomad agent is restarting.

## Troubleshooting

If the sidecar service is not running correctly, you can investigate
potential `envoy` failures in the following ways:

* Task logs in the associated `connect-*` task
* Task secrets (may contain sensitive information):
  * envoy CLI command: `secrets/.envoy_bootstrap.cmd`
  * environment variables: `secrets/.envoy_bootstrap.env`
* An extra allocation log file: `alloc/logs/envoy_bootstrap.stderr.0`

For example, with an allocation ID starting with `b36a`:

```shell-session
$ nomad alloc status -short b36a  # to get the connect-* task name
$ nomad alloc logs -task connect-proxy-count-api -stderr b36a
$ nomad alloc exec -task connect-proxy-count-api b36a cat secrets/.envoy_bootstrap.cmd
$ nomad alloc exec -task connect-proxy-count-api b36a cat secrets/.envoy_bootstrap.env
$ nomad alloc fs b36a alloc/logs/envoy_bootstrap.stderr.0
```

~> **Note:** If the allocation is unable to start successfully, debugging files may
only be accessible from the host filesystem. However, the sidecar task secrets
directory may not be available in systems where it is mounted in a temporary
filesystem.

Bootstrapping the Envoy proxy requires that the Consul ACL token and service
registration have successfully replicated to whichever Consul server the local
Consul agent is connected to. Nomad clients poll for both with exponential
backoff and a timeout. You can adjust the timeouts on a given node by setting
node metadata values via the command line or in the [`client.meta`][] agent
configuration block. The default values are shown below:

```shell-session
$ nomad node meta apply -node-id $nodeID \
    consul.token_preflight_check.timeout=10s \
    consul.token_preflight_check.base=500ms \
    consul.service_preflight_check.timeout=60s \
    consul.service_preflight_check.base=1s
```
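
To keep the same settings in the agent configuration instead, the values can go in the `client.meta` block. The sketch below repeats the default values shown above. Writing the dotted keys as quoted strings is an assumption about how they are best expressed in HCL.

```hcl
client {
  meta {
    # Same keys and default values as the command-line example.
    "consul.token_preflight_check.timeout"   = "10s"
    "consul.token_preflight_check.base"      = "500ms"
    "consul.service_preflight_check.timeout" = "60s"
    "consul.service_preflight_check.base"    = "1s"
  }
}
```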

[count-dashboard]: /img/count-dashboard.png
[consul_acl]: https://github.com/hashicorp/consul/issues/7414
[gh-9907]: https://github.com/hashicorp/nomad/issues/9907
[`Local`]: /consul/docs/security/acl/tokens#token-attributes
[anon_token]: /consul/docs/security/acl/tokens#special-purpose-tokens
[consul_ports]: /consul/docs/agent/config/config-files#ports
[consul_grpc_tls]: /consul/docs/upgrading/upgrade-specific#changes-to-grpc-tls-configuration
[cni_install]: /nomad/docs/deploy#linux-post-installation-steps
[transparent proxy]: /consul/docs/k8s/connect/transparent-proxy
[go-sockaddr/template]: https://pkg.go.dev/github.com/hashicorp/go-sockaddr/template
[`recursors`]: /consul/docs/agent/config/config-files#recursors
[`transparent_proxy`]: /nomad/docs/job-specification/transparent_proxy
[tproxy_no_dns]: /nomad/docs/job-specification/transparent_proxy#no_dns
[`consul-cni`]: https://releases.hashicorp.com/consul-cni
[virtual IP]: /consul/docs/services/discovery/dns-static-lookups#service-virtual-ip-lookups
[cni_plugins]: /nomad/docs/networking/cni#install-cni-reference-plugins
[consul_dns_port]: /consul/docs/agent/config/config-files#dns_port
[`network.dns`]: /nomad/docs/job-specification/network#dns-parameters
[`client.meta`]: /nomad/docs/configuration/client#meta
[uid]: /nomad/docs/job-specification/transparent_proxy#uid
